Using Stable Diffusion's img2img on a GTX 1070 (8 GB VRAM)
It worked after changing the code as follows
code:diff
diff --git a/scripts/img2img.py b/scripts/img2img.py
index 421e215..1a4f3ba 100644
--- a/scripts/img2img.py
+++ b/scripts/img2img.py
@@ -49,9 +49,17 @@ def load_img(path):
     image = Image.open(path).convert("RGB")
     w, h = image.size
     print(f"loaded input image of size ({w}, {h}) from {path}")
-    w, h = map(lambda x: x - x % 32, (w, h))  # resize to integer multiple of 32
+    ar = w / h
+    if w > h:
+        h = 512
+        w = int(ar * h)
+    else:
+        w = 512
+        h = int(w / ar)
+    w, h = map(lambda x: x - x % 16, (w, h))  # resize to integer multiple of 16
+    print(f"resized image of size ({w}, {h})")
     image = image.resize((w, h), resample=PIL.Image.LANCZOS)
-    image = np.array(image).astype(np.float32) / 255.0
+    image = np.array(image).astype(np.float16) / 255.0
     image = image[None].transpose(0, 3, 1, 2)
     image = torch.from_numpy(image)
     return 2.*image - 1.
@@ -198,6 +206,7 @@ def main():
     config = OmegaConf.load(f"{opt.config}")
     model = load_model_from_config(config, f"{opt.ckpt}")
+    model.half()
     device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
     model = model.to(device)
Confirmed it runs with the input image at 512x512
Note: the resolution-change logic (lines 10-18 of the diff) is written, but I haven't verified that it actually works
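The resize logic in the diff can be checked in isolation without PIL or a GPU. A minimal sketch (function name `target_size` is mine, the arithmetic mirrors the diff: scale the shorter side to 512, then round both sides down to a multiple of 16):

```python
def target_size(w: int, h: int, base: int = 512, multiple: int = 16):
    """Scale (w, h) so the shorter side becomes `base`,
    then round each side down to a multiple of `multiple`."""
    ar = w / h
    if w > h:
        h = base
        w = int(ar * h)
    else:
        w = base
        h = int(w / ar)
    return w - w % multiple, h - h % multiple

# The 2715x1642 landscape image from the log below shrinks to a 512-high frame
print(target_size(2715, 1642))
# A square input is untouched apart from the rounding
print(target_size(512, 512))
```

Note that for landscape inputs the long side still exceeds 512, so memory use grows with aspect ratio; only the short side is pinned.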
Explanation
Calling model.half() converts the weights to float16, which halves the size of the allocation it goes for (2.64 GiB -> 1.32 GiB)
code:zsh
RuntimeError: CUDA out of memory. Tried to allocate 1.32 GiB (GPU 0; 8.00 GiB total capacity; 6.05 GiB already allocated; 0 bytes free; 6.73 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
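The halving falls directly out of element sizes: float16 takes 2 bytes per element versus 4 for float32. A generic NumPy sketch (not the img2img code itself) makes the 2x factor visible:

```python
import numpy as np

# A 512x512 RGB image as float32, and its half-precision counterpart
x32 = np.zeros((512, 512, 3), dtype=np.float32)
x16 = x32.astype(np.float16)

print(x32.itemsize, x16.itemsize)  # bytes per element: 4 vs 2
print(x32.nbytes, x16.nbytes)      # total bytes: the fp16 array is exactly half
```

The same ratio applies to the model weights and activations on the GPU, which is why the failed allocation dropped from 2.64 GiB to 1.32 GiB.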
Around 512x800 this was still not enough
Resizing the input source to 512x512 made it go through
--.icon
Trial-and-error log
Loading a 2715x1642 image ran out of VRAM
code:zsh
RuntimeError: CUDA out of memory. Tried to allocate 4.18 GiB (GPU 0; 8.00 GiB total capacity; 4.09 GiB already allocated; 2.40 GiB free; 4.20 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Shrinking it down to a 512 base still wasn't enough
code:zsh
RuntimeError: CUDA out of memory. Tried to allocate 2.64 GiB (GPU 0; 8.00 GiB total capacity; 5.39 GiB already allocated; 406.09 MiB free; 5.61 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
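The error message itself points at one more knob: PYTORCH_CUDA_ALLOC_CONF. I have not tested this on the GTX 1070, and the 128 MiB cap below is an arbitrary starting point, not a recommendation:

```shell
# Cap the size of split blocks to reduce allocator fragmentation
# (value in MiB; tune for your GPU and workload)
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
# then run img2img as usual, e.g.:
# python scripts/img2img.py --init-img input.png --prompt "..."
echo "$PYTORCH_CUDA_ALLOC_CONF"
```

This only helps when reserved memory far exceeds allocated memory (fragmentation); it cannot conjure VRAM that the model genuinely needs.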